Mblk-1/E93, an ecdysone related-transcription factor, targets synaptic plasticity-related genes in the honey bee mushroom bodies

Among hymenopteran insects, aculeate species such as bees, ants, and wasps have enlarged and morphologically elaborate mushroom bodies (MBs), a higher-order brain center in the insect, implying their relationship with the advanced behavioral traits of aculeate species. The molecular bases leading to the acquisition of complicated MB functions, however, remains unclear. We previously reported the constitutive and MB-preferential expression of an ecdysone-signaling related transcription factor, Mblk-1/E93, in the honey bee brain. Here, we searched for target genes of Mblk-1 in the worker honey bee MBs using chromatin immunoprecipitation sequence analyses and found that Mblk-1 targets several genes involved in synaptic plasticity, learning, and memory abilities. We also demonstrated that Mblk-1 expression is self-regulated via Mblk-1-binding sites, which are located upstream of Mblk-1. Furthermore, we showed that the number of the Mblk-1-binding motif located upstream of Mblk-1 homologs increased associated with evolution of hymenopteran insects. Our findings suggest that Mblk-1, which has been focused on as a developmental gene transiently induced by ecdysone, has acquired a novel expression pattern to play a role in synaptic plasticity in honey bee MBs, raising a possibility that molecular evolution of Mblk-1 may have partly contributed to the elaboration of MB function in insects.

Luciferase assay. We obtained the coding sequence of Mblk-1 from Mblk-1/pPac-PL 35 and inserted it into a multiple cloning region of an expression vector, pHEK293 Ultra Expression Vector II (TaKaRa). Reporter vectors were constructed by inserting 6xGAGA, 6xGAGA with a 4-bp spacer sequence, 6xATT TTG , 6xGAA ATT TT, 6xATT TTG G, and an Mblk-1-binding region sequence (NC_037652.1:7063408-7063595, 188 bp), respectively, into the upstream region of minimal promoter of pGL4.23 [luc2/minp]. For the Mblk-1-binding region sequence, we used the overlap region of the 2 Mblk-1-binding sequences (NC_037652.1:7063147-7063709 and NC_037652.1:7063400-7063629) obtained from the 2 independent ChIP-seq experiments. pRL-TK was used as an internal control vector to normalize transfection efficiency. The Mblk-1 expression vector or pHEK 293 Ultra Expression Vector II without the Mblk-1 coding sequence as a negative control, pHEK293 Enhancer Vector, one of the reporter vectors (pGL4.23 with the motif sequences), and the internal control vector (pRL-TK) were transfected to HEK293T cells (RCB2202) from RIKEN BioResource Center using Lipofectamine LTX Reagent with PLUS Reagent (Invitrogen). The pHEK293 Enhancer Vector was used to enhance the expression of Mblk-1 in pHEK293 Ultra Expression Vector II. We measured Photinus pyralis (firefly) and Renilla reniformis luciferase activities 40-48 h after transfection with a Dual-Luciferase Reporter Assay System (Promega) and 2030 ARVO X5 (PerkinElmer). Photinus pyralis (firefly) luciferase activity was first normalized with the co-transfected internal control luciferase activity and then values of normalized luciferase activity of the control assay using the pHEK 293 Ultra Expression Vector II without the Mblk-1 coding sequence were subtracted to exclude effects other than those of Mblk-1 expression. The obtained value is shown as relative luminescence on the vertical axis of the graphs. We compared the values from vectors containing candidate sequences with that of the control vector (pGL4.23) to determine how the sequences affect the expression of luciferase.
RNA probe synthesis and in situ hybridization. Total RNA was extracted from 3 honey bee adult worker brains using TRIzol Reagent (Invitrogen). After removing the genomic DNA with the gDNA Eraser from the PrimeScript RT reagent Kit with gDNA Eraser (Perfect Real Time), the extracted RNA was reversetranscribed into cDNA using the same kit. The cDNA was partially amplified by PCR using Ex Taq (TaKaRa) with primers specific for each gene. The PCR products were cloned using pGEM-T Easy Vector Systems (Promega). The primer sequences used to synthesize the in situ hybridization RNA probes are listed in Supplementary Table 1. The DNA templates for in vitro transcription were amplified by PCR with M13 forward and reverse primers. The digoxigenin (DIG)-labeled sense and antisense RNA probes were synthesized using the amplified DNA template, a DIG RNA Labeling Mix (Roche), SP6 RNA polymerase (Roche), and T7 RNA polymerase (Roche). For in situ hybridization, dissected whole brains were embedded in Tissue-Tek OCT compound (Sakura Finetek Japan) without fixation and immediately frozen and stored at − 80 °C until use. The frozen whole brains were sliced into 10-µm thick sections. In situ hybridization was performed essentially as described previously 39 . Images were captured with a light microscope BX-50 (Olympus).

Results
Identification of Mblk-1 target gene candidates in the worker MBs by ChIP-seq analysis. Mblk-1 is expressed in a lKC-preferential manner in adult worker MBs (Fig. 1A) 16,18 . When we searched for highly expressed genes in the honey bee MBs using previously reported RNA-seq data 28 , we found that Mblk-1 was the 38th most highly expressed gene and the most highly expressed among transcription factors in MBs (Fig. 1B)    www.nature.com/scientificreports/ We previously prepared an antibody that binds specifically to Mblk-1 protein (hereafter termed anti-Mblk-1 antibody) (Fig. S1A) 18 . We first tested whether the anti-Mblk-1 antibody could be used for ChIP-seq analysis. In the immunoprecipitated sample, as well as in the input and supernatant samples after immunoprecipitation using worker MB lysate, full-length Mblk-1 protein was mainly detected as a 250-kDa band (Fig. S1B). The molecular weight of Mblk-1 is estimated as 175 kDa based on the protein sequence, but our previous study confirmed that the full-length Mblk-1 protein migrated at the position around 250-kDa in SDS-PAGE analysis 18 . Therefore, the immunoprecipitation results confirmed that the anti-Mblk-1 antibody can be used for immunoprecipitation.
We performed ChIP-seq analysis using homogenates of adult worker MBs and the anti-Mblk-1 antibody. A total of 593 genes were identified as target gene candidates of Mblk-1, in which Mblk-1-binding signals were located within 10 kb upstream or downstream of the corresponding genes (Table 1, Supplementary Data 1, and 3) when using a peak calling q-value threshold of 0.005 and the input DNA sample as a control. Schematic diagrams of some peak signals and their surrounding Mblk-1 target gene candidates are shown in Fig. 2A-F. The signal with the lowest q-value was in the third intron of an uncharacterized non-coding RNA gene (Gene 1 in Table 1 and Fig. 2A). Genes related to neural function, such as suppressor of lurcher protein 1 (Gene 2-1 in Table 1 and Fig. 2B), fasciclin-3 (Gene 4 in Table 1 and Fig. 2C), potassium voltage-gated channel subfamily KQT member 1 (Gene 9 in Table 1 and Fig. 2D), and CaMKII (Gene 17-1 in Table 1 and Fig. 2F) were also identified as Mblk-1 target gene candidates. Moreover, among the Mblk-1 target gene candidates, some ecdysone-signaling pathway-related genes such as ecdysone-induced protein 75 (E75) and Ultraspiracle (Usp) were identified. This is consistent with the fact that E75 is a direct regulatory target of DmE93 21 . Usp was listed in Table 1, and the peak signal was detected in the 5′ untranslated region of this gene (Fig. 2E).
Three distinct DNA binding motifs of Mblk-1. To search for Mblk-1-binding motifs, we next performed motif analysis using sequences corresponding to the Mblk-1-binding regions and MEME-ChIP 34 . Three consensus sequences were identified by DREME, which discovers short and ungapped motifs: GA-rich sequence (GAGA or GAG, hereafter termed a GAGA motif), GAA ATT TT, and AT(C)TTT GTA (Fig. 3A). A combination of these motifs was also found by MEME, which discovers relatively long motifs (Fig. 3B). In a previous study, we identified an Mblk-1 preferred binding sequence named MBE (Mblk-1-binding element, 5′-AGG TAG AGA TCG ATC GAT AGGG-3′) by using an in vitro binding site selection method 17 . The motif we found in this study is consistent with MBE, because MBE has a GAGA sequence. In addition, the ATT TTC (T)GG motif is reported as a binding motif of DmE93 44 , and it is also reported that Bombyx mori E93 (BmE93) binds to GAGA-containing motifs 45 . These previous findings support the quality of our data and suggest that Mblk-1-binding motifs are conserved in insects.
We next performed a luciferase assay to examine whether Mblk-1 recognizes these 3 binding motifs and affects the transcription of a reporter gene. Human HEK293T cells were transfected with an Mblk-1 expression vector; a Renilla luciferase reporter vector; and 5 types of pGL4.23 firefly luciferase reporter vectors, each of which contains 6xGAGA, 6xGAGA with 4-bp spacer sequences, 6xATT TTG , 6xGAA ATT TT, or 6xATT TTG G upstream of the minimal promoter, respectively (Fig. 3C). Normalized firefly luciferase activity increased about 3.0-fold and 6.6-fold when co-transfected with pGL4.23 vectors containing 6xGAGA or 6xGAGA with 4-bp spacer sequences, respectively, compared with that when co-transfected with a pGL4.23 vector containing no binding motifs (Fig. 3D). This finding indicates that Mblk-1 recognizes GAGA motifs and upregulates the transcription of genes with these motifs. Normalized firefly luciferase activity, however, significantly decreased by approximately 60% when co-transfected with pGL4.23 vectors containing 6xGAA ATT TT compared with that when co-transfected with a pGL4.23 vector containing no binding motifs (Fig. 3D). This finding suggests that Mblk-1 recognizes GAA ATT TT motifs to downregulate the transcription of genes having this motif in contrast to those having GAGA motifs. In addition, no significant change in firefly luciferase activity was observed when co-transfected with a pGL4.23 vector containing 6xATT TTG or 6xATT TTG G (Fig. 3D), although the absence of an effect on the luciferase activity does not necessarily exclude the possibility that Mblk-1 binds to these motifs. Taken together, our results indicated that Mblk-1 affects gene transcription through GAGA or GAA ATT TT motifs.

Identification of genes upregulated in lKCs as candidates for Mblk-1 target genes. Given that
Mblk-1 upregulates its target genes in a lKC-preferential manner in the honey bee MBs, Mblk-1 target gene candidates are expected to be expressed preferentially in lKCs. To find genes expressed preferentially in lKCs among the identified Mblk-1 target gene candidates, we used recently reported honey bee MB single-cell RNAseq data 36 . When the MB cells were classified into 10 clusters (Fig. 4A), high expression of PLCe (LOC408804) was observed in clusters 1-8, but not in clusters 0 or 9 (Fig. S2A). We considered clusters 1-8 with high PLCe expression to be KCs, because we previously reported high expression of this gene in all KCs (the entire MBs) 39 . Of these 8 clusters, preferential Mblk-1 expression was observed in clusters 1, 2, 6, and 7 (Fig. 4B). In addition to Mblk-1, genes indicated as preferentially expressed in lKCs by in situ hybridization, such as disks large homolog 5 (LOC410178) 39 and Ryr (LOC408680) 46 , were highly expressed at least in 1 of the 4 clusters (clusters 1, 2, 6, and/or 7) (Fig. S2B,C and Supplementary Data 4). Therefore, we regarded these 4 clusters as lKCs. Preferential expression of middle-type Kenyon cell-preferential arrestin-related protein (LOC725542), which is reported to be expressed in mKCs 47 , was observed in clusters 0, 3, and 8 ( Fig. S2D and Supplementary Data 4). This finding suggests that clusters 3 and 8 corresponded to mKCs. Although the expression level of PLCe in cluster 0 was lower than that in other clusters, the expression level of the pan-neuronal marker ELAV-like protein 2 (LOC410689) was high (Fig. S2E), suggesting that cells in this cluster are non-KCs (possibly optic lobe neurons) that contaminated the MB samples. In cluster 4, the most preferentially expressed gene was POU domain, class 6, transcription factor 2 (LOC408435) (Fig. 4C and Supplementary Data 4), but no known KC-subtype marker  Table 1. Top 20 ChIP-seq peak signals from adult worker MB homogenates and Mblk-1 target gene candidates located within ± 10 kb around the corresponding signals. Gene ID, gene description, signal fold enrichment, − log 10 (q-value), and signal position relative to the genes are shown for each ChIP-seq peak signal. The dashes for the 5th peak signal indicate the absence of genes located within ± 10 kb of the peak signal. Note that side numbers (e.g., 2-1 and 2-2) indicate that multiple genes are located within ± 10 kb of the corresponding peak (e.g., peak 2).
genes were suggested and therefore the known KCs to which this cluster corresponds remain unclear. Among the 8 clusters corresponding to KCs, Tk (tachykinin) appeared to be preferentially expressed in cluster 5 (Fig. 4C). This gene is expressed in both sKCs and lKCs, with greater expression in sKCs than in lKCs 48 . In the violin plot, Tk expression was highest in cluster 5 (Fig. S2F), suggesting that cluster 5 corresponds to sKCs. Genes specifically expressed in Class II KCs have not yet been identified by in situ hybridization analyses 12 , making it difficult to identify which clusters correspond to Class II KCs. Cluster 9 likely comprises doublet cells, because more than half of the cells in this cluster were determined to be doublets by DoubletFinder (Fig. S3). We identified 75 genes as possible Mblk-1 target genes, including Mblk-1 itself, with higher expression in at least one of clusters 1, 2, 6, and 7 than in the other clusters (Supplementary Data 5). Of these 75 genes, 2 genes (CaMKII and pumilio homolog 2) were suggested to be marker genes for 3 of the 4 clusters comprising lKCs ( Fig. 4D and Supplementary Data 4); 19 genes, including CUGBP Elav-like family member 4, were marker genes for 2 of the 4 clusters; and 53 genes, including suppressor of lurcher protein 1 and Usp, were marker genes for 1 of the 4 clusters (Fig. 4D). In situ hybridization analysis of these genes was performed to examine whether they are preferentially expressed in lKCs. CaMKII was confirmed to be expressed preferentially in lKCs (Fig. 4E), consistent with a previous report 13 . In addition, we found preferential expression of pumilio homolog 2 in lKCs (Fig. 4F). CUGBP Elav-like family member 4 was expressed in the whole MBs, but not in other brain regions (Fig. 4G), consistent with the result of scRNA-seq analysis showing that CUGBP Elav-like family member 4 was also a marker gene for clusters other than the 4 clusters, which correspond to lKCs (Supplementary Data 4). Schematic diagrams of the peak signals located around CUGBP Elav-like family member 4 and pumilio homolog 2 are shown in Fig. S4.  Table 1, respectively. Note that there are sometimes overlapping genes for each genomic region, for example, LOC726948, which is annotated as suppressor of lurcher protein 1, overlaps with LOC102655522 and LOC102655477.

Comparison of Mblk-1 target gene candidate profiles between pupal and adult worker honey bee brains.
Mblk-1 expression level is higher in pupal brains than adult brains 18 . In addition, DmE93 is expressed transiently in pupal MB neuroblasts to activate autophagy during metamorphosis 24 . Thus, we next examined whether Mblk-1 has distinct target genes between adult and pupal worker brains in the honey bee. To this end, we performed ChIP-seq analysis using pupal worker brain homogenates and the anti-Mblk-1 antibody. A total of 263 genes were identified as pupal Mblk-1 target gene candidates, for which Mblk-1-binding signals are located within 10 kb upstream or downstream of the corresponding genes when using a peak calling q-value threshold of 0.005 and the input DNA sample as a control. Approximately half of the pupal Mblk-1 target gene candidates (138/263, 52%) were detected specifically in the ChIP-seq analysis using pupal brains (Fig. 5A, Supplementary Data 2 and 3). The Mblk-1-binding signals with the 20 lowest q-values (top 20 signals) and genes that have these signals within 10 kb upstream or downstream are shown in Fig. 5B. Among the 20 pupal target gene candidates shown in Fig. 5B, 8 genes were pupal-specific (8/20, 40%). For almost half of the 20 signals (9/20, 45%), no genes were localized within 10 kb upstream or downstream of the signals. For example, a neural developmental gene, Down syndrome cell adhesion molecule (Dscam) shown in Fig. 5B was not present among the adult target gene candidates, suggesting that this gene is a pupal-specific target. The peak signal was found in the first intron of Dscam, and only the GAGA motif was found in the Mblk-1-binding region. In addition to Dscam, several developmental genes, including Ultrabithorax and Tyrosine-protein kinase transmembrane receptor Ror, were specifically detected in the pupal analysis (Supplementary Data 2). We performed GO enrichment analysis for adult-specific, pupal-specific, or common target gene candidates by Metascape 33 . The target gene candidates used for the GO enrichment analysis are listed in Supplementary Data 6. In our analysis of the adult-specific target gene candidates, generation of neurons (GO:0048699) and open tracheal system development (GO:0007424) were listed as the top 2 (Fig. S5). On the other hand, in our analysis of the pupal-specific target gene candidates, compound eye development (GO:0048749) and urogenital system development (GO:0001655) were suggested as the top 2 (Fig. S6). We also performed GO enrichment analysis using the common target gene candidates and locomotion (GO:0040011) was suggested as the top GO term (Fig. S7).   List of the top 20 ChIP-seq peak signals identified from pupal worker brain homogenates and Mblk-1 target gene candidates located within ± 10 kb of the corresponding signals. The Gene ID, gene description, signal fold enrichment, − log 10 (q-value), and signal position relative to the genes are shown for each ChIP-seq peak signal. Dashes indicate the absence of genes located within ± 10 kb of the corresponding peak signal. Note that side numbers (e.g., 3-1 and 3-2) indicate that multiple genes are located within ± 10 kb of the corresponding peak (e.g., peak 3).  www.nature.com/scientificreports/ Self-regulation of Mblk-1 expression in honey bee MBs. Interestingly, two distinct peak signals were found 2 kb and 25 kb upstream of the Mblk-1 gene, respectively (Fig. 6A). All these signals contained GAGA motifs, which induced the expression of the downstream luciferase gene in the reporter assay (Fig. 3D). From this result, we expected that Mblk-1 induces its own expression via these GAGA motifs. To confirm this, we performed a luciferase assay with a pGL4.23 firefly luciferase reporter vector that contains the sequence corresponding to the peak signal located 25 kb upstream of Mblk-1 gene (Fig. 6A). We observed an approximately 4.9-fold increase in firefly luciferase activity when using the pGL4.23 vector containing the peak signal sequence compared with that when using the pGL4.23 vector containing no binding motifs (Fig. 6B). Therefore, Mblk-1 is suggested to recognize this GAGA-containing region as an enhancer to upregulate its own expression. It is possible that once Mblk-1 is expressed, it can upregulate its own expression, which leads to its constitutive expression in adult honey bee lKCs.

Scientific Reports
Expression analysis of Mblk-1 homolog in the sawfly. In the honey bee, Mblk-1 is almost selectively expressed in the brain among various adult tissues 18 . In addition, in the ant Harpegnathos saltator workers and gamergates, the expression of the Mblk-1 homolog in the brain is observed 49 . We next tested whether the constitutive expression of Mblk-1 homologs in the adult brain is conserved even in other hymenopteran insect species (Fig. 6C). We performed qRT-PCR analysis to examine the expression of the Mblk-1 homolog (LOC105691346) in various adult body parts of the sawfly (Athalia rosae), which is a solitary, phytophagous, and primitive hymenopteran insect species 43 . Although significant expression of the sawfly Mblk-1 homolog was detected in the head without the brain, thorax, and abdomen, the expression in the brain was below the detection threshold (crossing point Cp: 40) ( Table S2). The relative expression levels of the Mblk-1 homolog did not differ significantly among body parts when the brain Cp values were set to 40 (Fig. 6D). These results indicate that the expression level of Mblk-1 homolog is very low in the sawfly brain, and that brain-specific expression, which is observed in the honey bee, is not conserved in the sawfly.
Comparative analysis of the upstream sequences of Mblk-1 homologs. Finally, we investigated how accumulation of the GAGA motif (GAGA and GAG) related to the expression of Mblk-1 homologs in the brains of hymenopteran insects. We counted the number of GAGA motifs located within 2 kb and 10 kb from the transcription start site of each Mblk-1 homolog (Fig. 6E and Fig. S8). In all 4 species of Aculeata, the number of GAG sequences within 2 kb upstream was greater than 100, and the number of GAGA sequences was greater than 30. On the other hand, in Athalia rosae, in which no Mblk-1 homolog expression was observed in adult brains, the number of GAG sequences was 41. In addition, to determine the evolutional stage in hymenopteran insects at which the accumulation of GAGA motifs was acquired, we investigated the numbers of GAGA motifs in the 2 parasitoid wasps, Nasonia vitripennis and Fopius arisanus. The numbers of GAGA motifs in Nasonia vitripennis were comparable to those in Aculeata, while the numbers in Fopius arisanus were closer to those in Athalia rosae. The same trend was observed in the numbers of GAGA motifs within upstream of 10 kb (Fig. S8).

Discussion
Mblk-1 targets genes related to synaptic plasticity and ecdysone-signaling in adult honey bee MBs. In the present study, we newly identified genes related to neural function, such as CaMKII and pumilio homolog 2 as Mblk-1 target gene candidates in the worker honey bee MBs by ChIP-seq analyses. CaMKII is involved in synaptic functions [25][26][27] . In addition, Drosophila melanogaster pumilio has a role in synaptic plasticity and learning ability 50 . Therefore, it is likely that Mblk-1 is involved in neural function, such as learning and memory abilities through transcriptional regulation of these synaptic plasticity-related genes. Several ecdysone-related genes such as Usp were also identified as Mblk-1 target gene candidates. In forager bees (> 21 days), the expression of Usp is stronger in lKCs than in the inner compact KCs, which correspond to mKCs and sKCs 51 . It is likely that Usp is regulated differently between lKCs and inner compact KCs, and its expression in lKCs is controlled by Mblk-1. It is plausible that Usp has some neural functions that differ from its developmental functions like Mblk-1. This is consistent with reports that ecdysone and ecdysone-related genes are involved in neural functions, including learning and memory abilities, in Drosophila melanogaster 52 .
The luciferase assay suggested that the GAGA and GAA ATT TT motifs have different functions in transcriptional regulation by Mblk-1. GAGA and ATT TTG G motifs are suggested to be binding motifs for BmE93 and DmE93, respectively 44,45 , indicating that these two Mblk-1-binding motifs are conserved in insects. To the best of our knowledge, however, only AWT TTY GG and HTTTBGG, both of which are like ATT TTG G, are reported as DmE93-binding motifs in Fly Factor Survey and previous research using ChIP-seq analysis 44 . It is thus likely that Mblk-1 possesses a binding motif (the GAGA motif) that is not observed in DmE93. In the luciferase assay experiments, the expression of a reporter gene was upregulated when using the reporter vector containing the 6xGAGA sequence (Fig. 3D). Therefore, it is probable that Mblk-1 recognizes the GAGA motifs and upregulates the expression of genes around them in lKCs.

Distinct functions of Mblk-1 between pupal and adult brains.
We identified pupal-specific Mblk-1 target gene candidates (Supplementary Data 2), which include some neural developmental genes. Approximately half of the target gene candidates for pupal brains were not identified as Mblk-1 target gene candidates for adult MBs (Fig. 5A) www.nature.com/scientificreports/ from pupae to adults. There are two possible explanations for this change. One is that target genes of Mblk-1 that functions in regions other than MBs, such as OLs and ALs where Mblk-1 is expressed in early pupal stage 39 , were included since we used pupal whole brains for the ChIP-seq analysis. The other possible explanation is that Mblk-1 regulates different genes when it is phosphorylated. Mblk-1 is phosphorylated by mitogen-activated protein kinase 35 in a pupae-specific manner 18 . The anti-Mblk-1 antibody was designed to recognize both nonphosphorylated and phosphorylated Mblk-1 18 . It is thus possible that Mblk-1 changes its binding regions and target genes depending on phosphorylation.
The GO enrichment analysis of adult-specific target gene candidates suggested GO terms related to development, such as generation of neurons (GO:0048699) and open tracheal system development (GO:0007424) (Fig. S5), although the genes used in the analysis were obtained from the ChIP-seq experiment using adult MBs. This finding indicates that Mblk-1 in adult MBs regulates the expression of genes involved in development as well as those related to synaptic plasticity. It is likely that the functions of these developmental genes in adult MBs are different from the ones in developing pupae. Similarly, in the GO enrichment analysis of the pupal-specific target gene candidates, GO terms related to development, such as compound eye development (GO:0048749) and urogenital system development (GO:0001655), were suggested (Fig. S6). Because the genes used in the analysis were obtained from the ChIP-seq experiment using pupal brains, it is reasonable to assume that these genes are involved in neural development. In conclusion from the GO enrichment analyses, many of the Mblk-1 target gene candidates are related to development as expected from their conventional role, but some have neural functions in adult MBs. Self-regulation: a possible model for constitutive expression of Mblk-1 in adult honey bee MBs. Two distinct peak signals were found upstream of the Mblk-1 gene, and the GAGA motif (GAGA or GAG) is enriched in those signal regions. The results of the luciferase assay indicated that the DNA sequence corresponding to one of the peak signals was recognized by Mblk-1 and the expression of its downstream reporter gene was upregulated (Fig. 6B). Generally, in insect metamorphosis, E93 is transiently induced by ecdysone and ecdysone receptors 20 . On the other hand, Mblk-1 is constitutively expressed in honey bee lKCs 18 , whereas ecdysone receptor (EcR) is mainly expressed in sKCs in the honey bee MBs 53 . Therefore, together with the result of the luciferase assay, it is likely that the constitutive expression of Mblk-1 in lKCs is, at least partially, accounted for by self-regulation instead of ecdysone-dependent regulation. Mblk-1 expression seems to be initiated in the pupal stage, because its expression is observed in early pupae (P2), but not in the larval brain 39 . It is likely that Mblk-1 is first induced in the pupal brain under the control of ecdysone-EcR-Usp ternary complex, as suggested by earlier reports 20,21 . We hypothesize that Mblk-1, once activated, induces its own expression autonomously by self-regulation through the pupal to adult stages.
The above luciferase assay finding also suggests that the accumulation of the GAGA motif sequences upstream of Mblk-1 is important in the self-induction of Mblk-1. The numbers of the GAGA motif sequences in species in which the constitutive expression of Mblk-1 homologs in adult brains are observed (Apis mellifera and Harpegnathos saltator 49 ) were much greater than those in Athalia rosae. Consistent with this, the qRT-PCR analysis using sawfly body parts indicated that the Mblk-1 homolog is not selectively expressed in the brains of adult sawflies (Fig. 6D). For Nasonia vitripennis, the GAGA motifs are accumulated upstream of the Mblk-1 homolog, although we have not analyzed whether their Mblk-1 homologs are preferentially expressed in the adult brain. If the GAGA motif accumulation and the constitutive expression of Mblk-1 homologs in brains has been acquired in parasitoid wasps, the role of Mblk-1 homologs related to neural functions, including learning and memory abilities, might have been a platform for the regulation of parasitic behaviors and prerequisite for further advanced behaviors of aculeate species.
Recently, Davie et al. 54 reported the results of scRNA-seq using the adult Drosophila brain. When we investigated the expression of DmE93 using the data by Davie et al. 54 , we found that DmE93 was identified as a cluster marker in 25 among 87 clusters, and 3 of the former 25 clusters correspond to KCs. Therefore, unlike Mblk-1, DmE93 is expressed in broad brain regions, including the whole MBs, in the fly brain. In addition, from regulons suggested in the SCope, DmE93 is not likely to target CaMKII or pum in the fly brain. From the expression pattern of DmE93, it seems that the constitutive expression of E93 in MBs itself is not directly related to the MB function specific to the honey bee. Our present findings suggest that both the autonomous and preferential expression of Mblk-1 in lKCs and its altered binding-specificity are important for the function of Mblk-1 in lKCs in honey bees.
In summary, although Mblk-1/E93 is known as a transcription factor required for metamorphosis in insects [21][22][23][24] , the findings of the present study indicate that the roles of Mblk-1 have expanded to regulate synaptic plasticity. By acquiring new expression mechanisms, binding motifs, and target genes during the evolution of hymenopteran insects, Mblk-1 may contribute to neural development in pupae and the advanced learning and memory abilities in adults. Our study provides an example of molecular evolution that could contribute to the acquisition of novel MB function in insects. Similar mechanisms may be at work in the brains of animals in general, and our findings could contribute to a better understanding of the evolution of brains and behaviors.

Data availability
ChIP-seq data in this study were deposited in the GEO with the accession number GSE173409. The link for data access is as follows: https:// www. ncbi. nlm. nih. gov/ geo/ query/ acc. cgi? acc= GSE17 3409. We declare that all data supporting the findings of this study are available within the article and its supplementary information files.